---
# id: using-custom-llms
title: Using Custom LLMs for Evaluation
sidebar_label: Using Custom LLMs for Evaluation
---

<head>
  <link
    rel="canonical"
    href="https://deepeval.com/guides/guides-using-custom-llms"
  />
</head>

All of `deepeval`'s metrics uses LLMs for evaluation, and is currently defaulted to OpenAI's GPT models. However, for users that don't wish to use OpenAI's GPT models and would instead prefer other providers such as Claude (Anthropic), Gemini (Google), Llama-3 (Meta), or Mistral, `deepeval` provides an easy way for anyone to use literally **ANY** custom LLM for evaluation.

This guide will show you how to create custom LLMs for evaluation in `deepeval`, and demonstrate various methods to enforce valid JSON LLM outputs that are required for evaluation with the following examples:

- Llama-3 8B from Hugging Face `transformers`
- Mistral-7B v0.3 from Hugging Face `transformers`
- Gemini 1.5 Flash from Vertex AI
- Claude-3 Opus from Anthropic

## Creating A Custom LLM

Here's a quick example on a custom Llama-3 8B model being used for evaluation in `deepeval`:

```python
import transformers
import torch
from transformers import BitsAndBytesConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

from deepeval.models import DeepEvalBaseLLM


class CustomLlama3_8B(DeepEvalBaseLLM):
    def __init__(self):
        quantization_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.float16,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_use_double_quant=True,
        )

        model_4bit = AutoModelForCausalLM.from_pretrained(
            "meta-llama/Meta-Llama-3-8B-Instruct",
            device_map="auto",
            quantization_config=quantization_config,
        )
        tokenizer = AutoTokenizer.from_pretrained(
            "meta-llama/Meta-Llama-3-8B-Instruct"
        )

        self.model = model_4bit
        self.tokenizer = tokenizer

    def load_model(self):
        return self.model

    def generate(self, prompt: str) -> str:
        model = self.load_model()

        pipeline = transformers.pipeline(
            "text-generation",
            model=model,
            tokenizer=self.tokenizer,
            use_cache=True,
            device_map="auto",
            max_length=2500,
            do_sample=True,
            top_k=5,
            num_return_sequences=1,
            eos_token_id=self.tokenizer.eos_token_id,
            pad_token_id=self.tokenizer.eos_token_id,
        )

        return pipeline(prompt)

    async def a_generate(self, prompt: str) -> str:
        return self.generate(prompt)

    def get_model_name(self):
        return "Llama-3 8B"
```

There are **SIX** rules to follow when creating a custom LLM evaluation model:

1. Inherit `DeepEvalBaseLLM`.
2. Implement the `get_model_name()` method, which simply returns a string representing your custom model name.
3. Implement the `load_model()` method, which will be responsible for returning a model object.
4. Implement the `generate()` method with **one and only one** parameter of type string that acts as the prompt to your custom LLM.
5. The `generate()` method should return the generated string output from your custom LLM. Note that we called `pipeline(prompt)` to access the model generations in this particular example, but this could be different depending on the implementation of your custom model object.
6. Implement the `a_generate()` method, with the same function signature as `generate()`. **Note that this is an async method**. In this example, we called `self.generate(prompt)`, which simply reuses the synchronous `generate()` method. However, although optional, you should implement an asynchronous version (if possible) to speed up evaluation.

:::caution
In later sections, you'll find an exception to rules 4. and 5., as the `generate()` and `a_generate()` method can actually be rewritten to optimize custom LLM outputs that are essential for evaluation.
:::

Then, instantiate the `CustomLlama3_8B` class and test the `generate()` (or `a_generate()`) method out:

```python
...

custom_llm = CustomLlama3_8B()
print(custom_llm.generate("Write me a joke"))
```

Finally, supply it to a metric to run evaluations using your custom LLM:

```python
from deepeval.metrics import AnswerRelevancyMetric
...

metric = AnswerRelevancyMetric(model=custom_llm)
metric.measure(...)
```

**Congratulations 🎉!** You can now evaluate using any custom LLM of your choice on all LLM evaluation metrics offered by `deepeval`.

## More Examples

### Azure OpenAI Example

Here is an example of creating a custom Azure OpenAI model through langchain's `AzureChatOpenAI` module for evaluation:

```python
from langchain_openai import AzureChatOpenAI
from deepeval.models.base_model import DeepEvalBaseLLM

class AzureOpenAI(DeepEvalBaseLLM):
    def __init__(
        self,
        model
    ):
        self.model = model

    def load_model(self):
        return self.model

    def generate(self, prompt: str) -> str:
        chat_model = self.load_model()
        return chat_model.invoke(prompt).content

    async def a_generate(self, prompt: str) -> str:
        chat_model = self.load_model()
        res = await chat_model.ainvoke(prompt)
        return res.content

    def get_model_name(self):
        return "Custom Azure OpenAI Model"

# Replace these with real values
custom_model = AzureChatOpenAI(
    openai_api_version=openai_api_version,
    azure_deployment=azure_deployment,
    azure_endpoint=azure_endpoint,
    openai_api_key=openai_api_key,
)
azure_openai = AzureOpenAI(model=custom_model)
print(azure_openai.generate("Write me a joke"))
```

When creating a custom LLM evaluation model you should **ALWAYS**:

- inherit `DeepEvalBaseLLM`.
- implement the `get_model_name()` method, which simply returns a string representing your custom model name.
- implement the `load_model()` method, which will be responsible for returning a model object.
- implement the `generate()` method with **one and only one** parameter of type string that acts as the prompt to your custom LLM.
- the `generate()` method should return the final output string of your custom LLM. Note that we called `chat_model.invoke(prompt).content` to access the model generations in this particular example, but this could be different depending on the implementation of your custom model object.
- implement the `a_generate()` method, with the same function signature as `generate()`. **Note that this is an async method**. In this example, we called `await chat_model.ainvoke(prompt)`, which is an asynchronous wrapper provided by LangChain's chat models.

:::tip
The `a_generate()` method is what `deepeval` uses to generate LLM outputs when you execute metrics / run evaluations asynchronously.

If your custom model object does not have an asynchronous interface, simply reuse the same code from `generate()` (scroll down to the `Mistral7B` example for more details). However, this would make `a_generate()` a blocking process, regardless of whether you've turned on `async_mode` for a metric or not.
:::

Lastly, to use it for evaluation for an LLM-Eval:

```python
from deepeval.metrics import AnswerRelevancyMetric
...

metric = AnswerRelevancyMetric(model=azure_openai)
```

:::note
While the Azure OpenAI command configures `deepeval` to use Azure OpenAI globally for all LLM-Evals, a custom LLM has to be set each time you instantiate a metric. Remember to provide your custom LLM instance through the `model` parameter for metrics you wish to use it for.
:::

### Mistral 7B Example

Here is an example of creating a custom [Mistral 7B model](https://huggingface.co/docs/transformers/model_doc/mistral) through Hugging Face's `transformers` library for evaluation:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from deepeval.models.base_model import DeepEvalBaseLLM

class Mistral7B(DeepEvalBaseLLM):
    def __init__(
        self,
        model,
        tokenizer
    ):
        self.model = model
        self.tokenizer = tokenizer

    def load_model(self):
        return self.model

    def generate(self, prompt: str) -> str:
        model = self.load_model()

        device = "cuda" # the device to load the model onto

        model_inputs = self.tokenizer([prompt], return_tensors="pt").to(device)
        model.to(device)

        generated_ids = model.generate(**model_inputs, max_new_tokens=100, do_sample=True)
        return self.tokenizer.batch_decode(generated_ids)[0]

    async def a_generate(self, prompt: str) -> str:
        return self.generate(prompt)

    def get_model_name(self):
        return "Mistral 7B"

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

mistral_7b = Mistral7B(model=model, tokenizer=tokenizer)
print(mistral_7b.generate("Write me a joke"))
```

Note that for this particular implementation, we initialized our `Mistral7B` model with an additional `tokenizer` parameter, as this is required in the decoding step of the `generate()` method.

:::info
You'll notice we simply reused `generate()` in `a_generate()`, because unfortunately there's no asynchronous interface for Hugging Face's `transformers` library, which would make all metric executions a synchronous, blocking process.

However, you can try offloading the generation process to a separate thread instead:

```python
import asyncio

class Mistral7B(DeepEvalBaseLLM):
    # ... (existing code) ...

    async def a_generate(self, prompt: str) -> str:
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.generate, prompt)
```

Some additional considerations and reasons why you should be extra careful with this implementation:

- Running the generation in a separate thread may not fully utilize GPU resources if the model is GPU-based.
- There could be potential performance implications of frequently switching between threads.
- You'd need to ensure thread safety if multiple async generations are happening concurrently and sharing resources.

:::

Lastly, to use your custom `Mistral7B` model for evaluation:

```python
from deepeval.metrics import AnswerRelevancyMetric
...

metric = AnswerRelevancyMetric(model=mistral_7b)
```

:::tip
You need to specify the custom evaluation model you created via the `model` argument when creating a metric.
:::

### Google VertexAI Example

Here is an example of creating a custom Google's [Gemini](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/model-versioning#stable-version) model through langchain's `ChatVertexAI` module for evaluation:

```python
from langchain_google_vertexai import (
    ChatVertexAI,
    HarmBlockThreshold,
    HarmCategory
)
from deepeval.models.base_model import DeepEvalBaseLLM

class GoogleVertexAI(DeepEvalBaseLLM):
    """Class to implement Vertex AI for DeepEval"""
    def __init__(self, model):
        self.model = model

    def load_model(self):
        return self.model

    def generate(self, prompt: str) -> str:
        chat_model = self.load_model()
        return chat_model.invoke(prompt).content

    async def a_generate(self, prompt: str) -> str:
        chat_model = self.load_model()
        res = await chat_model.ainvoke(prompt)
        return res.content

    def get_model_name(self):
        return "Vertex AI Model"

# Initialize safety filters for vertex model
# This is important to ensure no evaluation responses are blocked
safety_settings = {
    HarmCategory.HARM_CATEGORY_UNSPECIFIED: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
}

#TODO : Add values for project and location below
custom_model_gemini = ChatVertexAI(
    model_name="gemini-1.0-pro-002"
    , safety_settings=safety_settings
    , project= "<project-id>"
    , location= "<region>" #example : us-central1
)

# initialize the  wrapper class
vertexai_gemini = GoogleVertexAI(model=custom_model_gemini)
print(vertexai_gemini.generate("Write me a joke"))
```

To use it for evaluation for an LLM-Eval:

```python
from deepeval.metrics import AnswerRelevancyMetric
...

metric = AnswerRelevancyMetric(model=vertexai_gemini)
```

### AWS Bedrock Example

Here is an example of creating a custom AWS Bedrock model through the `langchain_community.chat_models` module for evaluation:

```python
from langchain_community.chat_models import BedrockChat
from deepeval.models.base_model import DeepEvalBaseLLM

class AWSBedrock(DeepEvalBaseLLM):
    def __init__(
        self,
        model
    ):
        self.model = model

    def load_model(self):
        return self.model

    def generate(self, prompt: str) -> str:
        chat_model = self.load_model()
        return chat_model.invoke(prompt).content

    async def a_generate(self, prompt: str) -> str:
        chat_model = self.load_model()
        res = await chat_model.ainvoke(prompt)
        return res.content

    def get_model_name(self):
        return "Custom Azure OpenAI Model"

# Replace these with real values
custom_model = BedrockChat(
    credentials_profile_name=<your-profile-name>, # e.g. "default"
    region_name=<your-region-name>, # e.g. "us-east-1"
    endpoint_url=<your-bedrock-endpoint>, # e.g. "https://bedrock-runtime.us-east-1.amazonaws.com"
    model_id=<your-model-id>, # e.g. "anthropic.claude-v2"
    model_kwargs={"temperature": 0.4},
)

aws_bedrock = AWSBedrock(model=custom_model)
print(aws_bedrock.generate("Write me a joke"))
```

Finally, supply the newly created `aws_bedrock` model to LLM-Evals:

```python
from deepeval.metrics import AnswerRelevancyMetric
...

metric = AnswerRelevancyMetric(model=aws_bedrock)
```

## JSON Confinement for Custom LLMs

:::tip
This section is also highly applicable if you're looking to [benchmark your own LLM](/docs/benchmarks-introduction), as open-source LLMs often require JSON and output confinement to output valid answers for public benchmarks supported by `deepeval`.
:::

In the previous section, we learnt how to create a custom LLM, but if you've ever used custom LLMs for evaluation in `deepeval`, you may have encountered the following error:

```bash
ValueError: Evaluation LLM outputted an invalid JSON. Please use a better evaluation model.
```

This error arises when the custom LLM used for evaluation is unable to generate valid JSONs during metric calculation, which stops the evaluation process altogether. This happens because for smaller and less powerful LLMs, prompt engineering alone is not sufficient to enforce JSON outputs, which so happens to be the method used in `deepeval`'s metrics. As a result, it's vital to find a workaround for users not using OpenAI's GPT models for evaluation.

:::info
All of `deepeval`'s metrics require the evaluation model to generate valid JSONs to extract properties such as: reasons, verdicts, statements, and other types of LLM-generated responses that are later used for calculating metric scores, and so when the generated JSONs required to extract these properties are invalid (eg. missing brackets, incomplete string quotations, extra trailing commas, or mismatched keys), `deepeval` won't be able to use the necessary information required for metric calculation. Here's an example of an invalid JSON an open-source model like `mistralai/Mistral-7B-Instruct-v0.3` might output:

```bash
{
    "reaso: "The actual output does directly not address the input",
}
```

:::

### Rewriting the `generate()` and `a_generate()` Method Signatures

In the previous section, we saw how the `generate()` and `a_generate()` methods must accept _one_ argument of type `str` and return the corresponding LLM generated `str`. To enforce JSON outputs generated by your custom LLM, the first step is to rewrite the `generate()` and `a_generate()` method to **accept an additional argument of type `BaseModel`, and output a `BaseModel` instead of a `str`.**

:::note
The `BaseModel` type is a type provided by the `pydantic` library, which is an extremely common typing library in Python.

```python
from pydantic import BaseModel
```

:::

Continuing from the `CustomLlama3_8B` example, here is what the method signature for the new `generate()` and `a_generate()` methods should look like:

```python
from pydantic import BaseModel

class CustomLlama3_8B(DeepEvalBaseLLM):
    ...

    def generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        pass

    async def a_generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        return self.generate(prompt, schema)
```

You might be wondering, **how does changing the method signature help with enforcing JSON outputs?**

It helps because in `deepeval`'s metrics, when there is a `schema: BaseModel` argument defined for the `generate()` and/or `a_generate()` method, `deepeval` will inject your generate methods with the Pydantic schemas which you can leverage to enforce JSON outputs. Let's see how we can do that.

### Reimplementing the `generate()` and `a_generate()` Methods

With the new method signatures, `deepeval` will now automatically inject your custom LLM with the required Pydantic schemas, which you can leverage to enforce JSON outputs for each LLM generation.

There are many ways to leverage Pydantic schemas to confine LLMs to generate valid JSONs, and continuing with our `CustomLlama3_8B` example we will be using the `lm-format-enforcer` library to confine JSON outputs using the provided Pydantic schema.

```bash
pip install lm-format-enforcer
```

```python
import json
import transformers
from pydantic import BaseModel
from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import (
    build_transformers_prefix_allowed_tokens_fn,
)

from deepeval.models import DeepEvalBaseLLM


class CustomLlama3_8B(DeepEvalBaseLLM):
    ...

    def generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        # Same as the previous example above
        model = self.load_model()
        pipeline = transformers.pipeline(
            "text-generation",
            model=model,
            tokenizer=self.tokenizer,
            use_cache=True,
            device_map="auto",
            max_length=2500,
            do_sample=True,
            top_k=5,
            num_return_sequences=1,
            eos_token_id=self.tokenizer.eos_token_id,
            pad_token_id=self.tokenizer.eos_token_id,
        )

        # Create parser required for JSON confinement using lmformatenforcer
        parser = JsonSchemaParser(schema.model_json_schema())
        prefix_function = build_transformers_prefix_allowed_tokens_fn(
            pipeline.tokenizer, parser
        )

        # Output and load valid JSON
        output_dict = pipeline(prompt, prefix_allowed_tokens_fn=prefix_function)
        output = output_dict[0]["generated_text"][len(prompt) :]
        json_result = json.loads(output)

        # Return valid JSON object according to the schema DeepEval supplied
        return schema(**json_result)

    async def a_generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        return self.generate(prompt, schema)

```

:::tip
We're calling `self.generate(prompt, schema)` in the `a_generate()` method to keep things simple, but you should aim to implement an asynchronous version of your custom LLM implementation and enforce JSON outputs the same way you would in the `generate()` method to keep evaluations fast.
:::

Now, try running metrics with the new `generate()` and `a_generate()` methods:

```python
from deepeval.metrics import AnswerRelevancyMetric
...

custom_llm = CustomLlama3_8B()
metric = AnswerRelevancyMetric(model=custom_llm)
metric.measure(...)
```

**Congratulations 🎉!** You can now evaluate using any custom LLM of your choice on all LLM evaluation metrics offered by `deepeval`, without JSON errors (hopefully).

In the next section, we'll go through two JSON confinement libraries that covers a wide range of LLM interfaces.

## JSON Confinement libraries

There are two JSON confinement libraries that you should know about depending on the custom LLM you're using:

1. `lm-format-enforcer`: The **LM-Format-Enforcer** is a versatile library designed to standardize the output formats of language models. It supports Python-based language models across various platforms, including popular frameworks such as `transformers`, `langchain`, `llamaindex`, llama.cpp, vLLM, Haystack, NVIDIA, TensorRT-LLM, and ExLlamaV2. For comprehensive details about the package and advanced usage instructions, [please visit the LM-format-enforcer github page](https://github.com/noamgat/lm-format-enforcer). The LM-Format-Enforcer combines a **character-level parser** with a **tokenizer prefix tree**. Unlike other libraries that strictly enforce output formats, this method enables LLMs to sequentially generate tokens that meet output format constraints, thereby enhancing the quality of the output.

2. `instructor`: **Instructor** is a user-friendly python library built on top of Pydantic. It enables straightforward confinement of your LLM's output by encapsulating your LLM client within an Instructor method. It simplifies the process of extracting structured data, such as JSON, from LLMs including GPT-3.5, GPT-4, GPT-4-Vision, and open-source models like Mistral/Mixtral, Anyscale, Ollama, and llama-cpp-python. For more information on advanced usage or integration with other models not covered here, [please consult the documentation](https://github.com/jxnl/instructor).

:::note
You may wish to wish any JSON confinement libraries out there, and we're just suggesting two that we have found useful when crafting this guide.
:::

In the final section, we'll show several popular end-to-end examples of custom LLMs using either `lm-format-enforcer` or `instructor` for JSON confinement.

## More Examples

### `Mistral-7B-Instruct-v0.3` through `transformers`

Begin by installing the `lm-format-enforcer` package:

```bash
pip install lm-format-enforcer
```

Here's a full example of a JSON confined custom Mistral 7B model implemented through `transformers`:

```python
import json
from pydantic import BaseModel
import torch
from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import (
    build_transformers_prefix_allowed_tokens_fn,
)
from transformers import BitsAndBytesConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

from deepeval.models import DeepEvalBaseLLM


class CustomMistral7B(DeepEvalBaseLLM):
    def __init__(self):
        quantization_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.float16,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_use_double_quant=True,
        )

        model_4bit = AutoModelForCausalLM.from_pretrained(
            "mistralai/Mistral-7B-Instruct-v0.3",
            device_map="auto",
            quantization_config=quantization_config,
        )
        tokenizer = AutoTokenizer.from_pretrained(
            "mistralai/Mistral-7B-Instruct-v0.3"
        )

        self.model = model_4bit
        self.tokenizer = tokenizer

    def load_model(self):
        return self.model

    def generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        model = self.load_model()
        pipeline = pipeline(
            "text-generation",
            model=model,
            tokenizer=self.tokenizer,
            use_cache=True,
            device_map="auto",
            max_length=2500,
            do_sample=True,
            top_k=5,
            num_return_sequences=1,
            eos_token_id=self.tokenizer.eos_token_id,
            pad_token_id=self.tokenizer.eos_token_id,
        )

        # Create parser required for JSON confinement using lmformatenforcer
        parser = JsonSchemaParser(schema.model_json_schema())
        prefix_function = build_transformers_prefix_allowed_tokens_fn(
            pipeline.tokenizer, parser
        )

        # Output and load valid JSON
        output_dict = pipeline(prompt, prefix_allowed_tokens_fn=prefix_function)
        output = output_dict[0]["generated_text"][len(prompt) :]
        json_result = json.loads(output)

        # Return valid JSON object according to the schema DeepEval supplied
        return schema(**json_result)

    async def a_generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        return self.generate(prompt, schema)

    def get_model_name(self):
        return "Mistral-7B v0.3"
```

:::note
Similar to the `CustomLlama3_8B` example, you can similarly:

- pass in a `quantization_config` parameter if your compute resources are limited
- use the `lm-format-enforcer` library for JSON confinement

This is because the `CustomMistral7B` model is implemented through HF `transformers` as well.

:::

### `gemini-1.5-flash` through Vertex AI

Begin by installing the `instructor` package via pip:

```bash
pip install instructor
```

```python
from pydantic import BaseModel
import google.generativeai as genai
import instructor

from deepeval.models import DeepEvalBaseLLM


class CustomGeminiFlash(DeepEvalBaseLLM):
    def __init__(self):
        self.model = genai.GenerativeModel(model_name="models/gemini-1.5-flash")

    def load_model(self):
        return self.model

    def generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        client = self.load_model()
        instructor_client = instructor.from_gemini(
            client=client,
            mode=instructor.Mode.GEMINI_JSON,
        )
        resp = instructor_client.messages.create(
            messages=[
                {
                    "role": "user",
                    "content": prompt,
                }
            ],
            response_model=schema,
        )
        return resp

    async def a_generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        return self.generate(prompt, schema)

    def get_model_name(self):
        return "Gemini 1.5 Flash"
```

:::info
The `instructor` client automatically allows you to create a structured response by defining a `response_model` parameter which accepts a Pydantic `BaseModel` schema.
:::

### `claude-3-opus` through Anthropic

Begin by installing the `instructor` package via pip:

```bash
pip install instructor
```

```python
from pydantic import BaseModel
from anthropic import Anthropic

from deepeval.models import DeepEvalBaseLLM


class CustomClaudeOpus(DeepEvalBaseLLM):
    def __init__(self):
        self.model = Anthropic()

    def load_model(self):
        return self.model

    def generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        client = self.load_model()
        instructor_client = instructor.from_anthropic(client)
        resp = instructor_client.messages.create(
            model="claude-3-opus-20240229",
            max_tokens=1024,
            messages=[
                {
                    "role": "user",
                    "content": prompt,
                }
            ],
            response_model=schema,
        )
        return resp

    async def a_generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        return self.generate(prompt, schema)

    def get_model_name(self):
        return "Claude-3 Opus"
```

### Others

For any additional implementations, please come and ask away in the [DeepEval discord server](https://discord.com/invite/a3K9c8GRGt), we'll be happy to have you.
